For this report, I used the epuRate template by Yan Holtz, who made it publicly available.
For the final report of course 2, I will tell you a story about volcanoes. The data come from Tidy Tuesday, a project that releases a new dataset every week to help R learners improve their skills. The direct link is here.
Who has never heard of volcanoes? I think that as children we all heard about, saw, or even gave a talk about volcanoes at least once. But what do you really know about them? For me, they had always been synonymous with old mountains dating back thousands of years, to before humanity. But are all volcanoes extinct now? No, they are not; quite the opposite, and it may surprise you. But first things first: what exactly is a volcano?
A volcano is a rupture in the crust of a planetary-mass object, such as Earth, that allows hot lava, volcanic ash, and gases to escape from a magma chamber below the surface. Earth’s volcanoes occur because its crust is broken into 17 major, rigid tectonic plates that float on a hotter, softer layer in its mantle. Therefore, on Earth, volcanoes are generally found where tectonic plates are diverging or converging, and most are found underwater.
Erupting volcanoes can pose many hazards, not only in the immediate vicinity of the eruption. One such hazard is that volcanic ash can be a threat to aircraft, in particular those with jet engines where ash particles can be melted by the high operating temperature; the melted particles then adhere to the turbine blades and alter their shape, disrupting the operation of the turbine. Large eruptions can affect temperature as ash and droplets of sulfuric acid obscure the sun and cool the Earth’s lower atmosphere (or troposphere); however, they also absorb heat radiated from the Earth, thereby warming the upper atmosphere (or stratosphere). Historically, volcanic winters have caused catastrophic famines.
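Okay, now that we have the basics, let’s take a look at the number of eruptions that have occurred around the world.
# Load the packages used throughout (pipes, dplyr verbs, ggplot2, stringr, forcats)
library(tidyverse)
# Get the data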
volcano <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/volcano.csv')
eruptions <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/eruptions.csv')
tree_rings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/tree_rings.csv')
# Clean the data
volcano <- volcano %>%
mutate(last_eruption_year = as.numeric(last_eruption_year))
# Compute number of eruption + mean
number_eruption <- eruptions %>%
filter(eruption_category == "Confirmed Eruption") %>%
group_by(start_year) %>%
count() %>%
ungroup() %>%
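# zoo::rollmean computes a simple moving average over k consecutive rows (here k = 50).
# Years without a confirmed eruption have no row, so a window can span more than 50 calendar years.
# Despite its name, mean_10 holds the 50-row average; fill = NA leaves the first 49 rows empty instead of drawing them as 0.
mutate(mean_10 = zoo::rollmean(n, k = 50, fill = NA, align = "right"))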
# Plot 1
plot_eruptions <- ggplot(number_eruption, aes(x = start_year, y = n)) +
geom_line(alpha = 0.3) +
geom_line(aes(x = start_year, y = mean_10, color = "mean over 50 years")) +
scale_color_manual(name = "", values = "darkred") +
scale_x_continuous(limits = c(-8000, 2020)) +
ggthemes::theme_wsj() +
labs(title = "Confirmed volcanic eruptions",
x = "Year",
y = "Number of eruptions",
caption = "Source: Tidytuesday") +
theme(plot.title = element_text(size = 15),
axis.title = element_text(size = 10),
plot.caption = element_text(size = 8))
plot_eruptions
It seems that the number of eruptions in recent centuries has exploded! (Sorry for the pun.)
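The rolling mean used in the code above can be sketched in base R to make the windowing explicit. This is only an illustration under stated assumptions: `rolling_mean_right` is a hypothetical helper (not part of the report or of zoo), and the counts are made-up toy values. The second half shows one possible way to pad missing years with zero so that a window of k rows really covers k calendar years.

```r
# Right-aligned simple moving average over the last k observations (base R only).
# `rolling_mean_right` is a hypothetical helper written for this illustration.
rolling_mean_right <- function(x, k) {
  # With sides = 1, stats::filter averages the current value and the k - 1 before it.
  # The first k - 1 positions have no complete window and come back as NA.
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

counts <- c(2, 4, 6, 8, 10)            # toy yearly eruption counts (made up)
smoothed <- rolling_mean_right(counts, k = 3)

# Padding: years with no recorded eruption get an explicit count of 0,
# so that consecutive rows are consecutive calendar years.
years <- c(2000, 2001, 2004)           # toy years that have at least one eruption
n <- c(3, 1, 2)                        # toy counts for those years
all_years <- seq(min(years), max(years))
padded <- numeric(length(all_years))   # starts as all zeros
padded[match(years, all_years)] <- n
```

On the padded series, a window of 50 rows corresponds to 50 calendar years, which matches the "mean over 50 years" legend more literally than a window over recorded years only.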
# Number of eruptions since year 0
number_eruption_since_0 <- eruptions %>%
filter(eruption_category == "Confirmed Eruption") %>%
filter(start_year >= 0) %>%
group_by(start_year) %>%
count() %>%
ungroup() %>%
# 50-row rolling mean; fill = NA leaves the first 49 rows empty instead of drawing them as 0
mutate(mean_10 = zoo::rollmean(n, k = 50, fill = NA, align = "right"))
# Plot 2
plot_eruptions_since_0 <- ggplot(number_eruption_since_0) +
geom_line(aes(x = start_year, y = n), alpha = 0.3) +
geom_line(aes(x = start_year, y = mean_10, color = "mean over 50 years")) +
ggthemes::theme_wsj() +
labs(title = "Confirmed volcanic eruptions since year 0",
caption = "Source: Tidytuesday",
x = "Year",
y = "Number of eruptions") +
theme(plot.title = element_text(size = 15),
axis.title = element_text(size = 10),
plot.caption = element_text(size = 8)) +
scale_x_continuous( limits = c(500, 2020)) +
scale_color_manual(name = "", values = "darkred")
plot_eruptions_since_0
But the last few years have been rather stable.
# Number of eruptions since year 2000
number_eruption_since_2000 <- eruptions %>%
filter(eruption_category == "Confirmed Eruption") %>%
filter(start_year >= 2000) %>%
group_by(start_year) %>%
count()
# Plot 3
plot_eruptions_since_2000 <- ggplot(number_eruption_since_2000) +
geom_col(aes(x = start_year, y = n), fill = "darkred", alpha = 0.8) +
ggthemes::theme_wsj() +
labs(title = "Confirmed volcanic eruptions since 2000",
x = "Year",
y = "Number of eruptions",
caption = "Source: Tidytuesday") +
theme(plot.title = element_text(size = 15),
axis.title = element_text(size = 10),
plot.caption = element_text(size = 8))
plot_eruptions_since_2000
Although eruptions are difficult to predict, some studies suggest a link with climate change. Researchers from Caltech, Cambridge, Geneva and ETH Zurich describe an effect of climate change on the frequency of volcanic eruptions: melting ice decreases the pressure deep in the Earth’s mantle, where magma is generated. In turn, this decrease in pressure could increase magma production at depth and lead to more volcanic activity and emissions at the surface.
According to these data, there are 647 active volcanoes. But what counts as an active volcano?
A volcano is considered no longer active when it has not erupted for 10,000 years. So these volcanoes have erupted at least once since about 7980 BC.
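The cutoff year follows from simple arithmetic, sketched below in base R. The reference year 2020 is an assumption (the year this report appears to have been written), and the eruption years are made-up toy values rather than rows from the volcano data.

```r
# Cutoff for "active": last eruption within the past 10,000 years.
report_year <- 2020                     # assumption: year of the report
active_window <- 10000                  # years without eruption before a volcano stops counting as active
cutoff_year <- report_year - active_window

# Toy last-eruption years (made up); NA means no dated eruption is known
last_eruption_year <- c(-9000, -7980, 1500, 2020, NA)

# A volcano counts as active when it has a known eruption at or after the cutoff
is_active <- !is.na(last_eruption_year) & last_eruption_year >= cutoff_year
n_active <- sum(is_active)
```

Negative years stand for years before year 0, in the same way the eruption data use negative values for `start_year`.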
“Okay, but when I think about active volcanoes, I think about really active ones. Ones that erupted when I was at least already born.”
Alright. Let’s plot the volcanoes that erupted in 2020 on a map.
# Load sf for reading and plotting shapefiles
library(sf)
# Import the shapes of the world and water
world <- st_read("data/naturalearth/ne_110m_admin_0_countries.shp")
water <- st_read("data/naturalearth/ne_110m_ocean.shp")
# Select volcano erupted in 2020
volcano_2020 <- volcano %>%
filter(!is.na(last_eruption_year),
last_eruption_year >= 2020)
# Plot the map
ggplot() +
geom_sf(data = world)+
geom_sf(data = water, fill = "lightblue") +
geom_point(data = volcano,
mapping = aes(x = longitude, y = latitude),
color = "#70a494",
size = .8) +
geom_point(data = volcano_2020,
mapping = aes(x = longitude, y = latitude),
size = 1,
color = "#ca562c") +
theme_void() +
labs(
title = "**Volcanoes location**",
subtitle = "Those that erupted in 2020 are in <span style='color:#ca562c'>**red**</span>", # quoted style attribute, same color as the points
caption = "Source: tidytuesday",
fill = "" ) +
theme(plot.subtitle = ggtext::element_markdown(),
plot.title = ggtext::element_markdown())
Remember the definition? The locations of the volcanoes clearly trace the boundaries of the tectonic plates. We can also see that a lot of them are underwater.
“And which one is the most active? The one that never stops erupting?”
Well, let’s see which one has been the most active in the last 10,000 years!
# Selecting data for table
most_active_volcano <- eruptions %>%
filter(start_year >= -7980) %>%
group_by(volcano_name) %>%
count() %>%
arrange(desc(n)) %>%
head(10) %>%
ungroup()
# Create table for most active volcanoes (gt provides gt(), cols_label(), tab_header(), md())
library(gt)
most_active_volcano %>%
gt()%>%
cols_label(volcano_name = "Name", n = "number of eruptions") %>%
tab_header(md("<span style='color:#ca562c'>**Most active volcanoes**</span>"),
subtitle = "Top 10") %>%
tab_options(
column_labels.font.weight = "bold"
) %>%
opt_row_striping()

Most active volcanoes (Top 10)

| Name | Number of eruptions |
|---|---|
| Etna | 241 |
| Fournaise, Piton de la | 194 |
| Asosan | 185 |
| Villarrica | 164 |
| Asamayama | 147 |
| Katla | 132 |
| Klyuchevskoy | 111 |
| Mauna Loa | 109 |
| Merapi | 109 |
| Izu-Oshima | 106 |

“Yeah of course, Etna. And the biggest?”
# Selecting biggest volcanoes
biggest_volcano <- volcano %>%
arrange(desc(elevation)) %>%
head(5) %>%
select(volcano_name, region, country, last_eruption_year, elevation)
# Create table for biggest volcanoes
biggest_volcano %>%
gt()%>%
cols_label(volcano_name = "Name",
elevation = "height in meters",
last_eruption_year = "last active year") %>%
tab_header(md("<span style='color:#ca562c'>**Biggest volcanoes**</span>"),
subtitle = "Top 5") %>%
tab_options(
column_labels.font.weight = "bold"
) %>%
opt_row_striping()

Biggest volcanoes (Top 5)

| Name | Region | Country | Last active year | Height in meters |
|---|---|---|---|---|
| Ojos del Salado, Nevados | South America | Chile-Argentina | 750 | 6879 |
| Parinacota | South America | Chile-Bolivia | 290 | 6336 |
| Pular | South America | Chile | NA | 6233 |
| San Pedro-San Pablo | South America | Chile | 1960 | 6142 |
| Aracar | South America | Argentina | NA | 6095 |

It seems that the five highest volcanoes are all in South America.
Let’s take a look at what kinds of volcanoes exist.
#Selecting volcanoes types
volcano_type <- volcano %>%
mutate(primary_volcano_type = case_when(str_detect(primary_volcano_type,"Stratovolcano") ~ "Stratovolcano",
str_detect(primary_volcano_type,"Shield") ~ "Shield",
str_detect(primary_volcano_type,"Caldera") ~ "Caldera",
TRUE ~ "Other")) %>%
group_by(primary_volcano_type) %>%
count() %>%
ungroup() %>%
arrange(desc(n)) %>%
mutate(primary_volcano_type = fct_inorder(primary_volcano_type))
# plot the bar chart
volcano_type %>%
ggplot(mapping = aes(x = primary_volcano_type, y = n, fill = primary_volcano_type)) +
geom_col() +
labs(title = "Types of Volcanoes",
x = "",
y = "number of volcanoes",
fill = "",
caption = "Source: Tidytuesday") +
scale_fill_manual(values = c("#F25F1C", "#CA7C4C", "#9C9590", "#9399A4")) +
ggthemes::theme_wsj() +
theme(axis.line.x = element_blank(),
legend.position = "none", # "none" is the documented value for hiding the legend
axis.text.x = element_text(angle = 45, vjust = 1, hjust=1),
axis.title = element_text(size = 10),
plot.caption = element_text(size = 8))
Okay, we are almost finished! There is one last thing that caught my attention while I was researching volcanoes. Apparently, the year of an eruption can be approximately determined with… tree rings! The age of a tree can be determined from its rings (the concentric lines inside the trunk, typically one per year). A large eruption cools the air, the tree’s growth slows down, and the ring for that year comes out narrower.
This is what I read on the site futurity.org.
“…A massive volcano such as Thera ejects so much material into the atmosphere that it cools the earth. For cold-climate trees such as Irish oaks and bristlecones, that exceptionally cold year shows up as a much narrower tree ring. Salzer’s work reveals at least four different years within the new radiocarbon age range for Thera where the bristlecone pines had exceptionally narrow rings that might indicate a huge volcanic eruption…”
It so happens that we have the data to measure the correlation between n_tree and the mean temperature of a given year. The numbers are a bit technical, but let me try to explain them. The n_tree value is a z-score: it tells how many standard deviations a measurement lies above (positive) or below (negative) the mean. The mean temperature is given as an index in degrees Celsius.
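To make the z-score idea concrete, here is a minimal base R sketch. It assumes the scores are computed relative to a reference period, as the axis label of the plot suggests (years 1000 to 1099); the exact normalization used in the dataset is not documented in this report. `z_score_vs_reference` is a hypothetical helper, and the ring widths are made-up toy values.

```r
# z-score of each value relative to the mean and standard deviation of a reference period.
# `z_score_vs_reference` is a hypothetical helper written for this illustration.
z_score_vs_reference <- function(x, is_reference) {
  ref <- x[is_reference]
  (x - mean(ref)) / sd(ref)
}

# Toy ring widths (made up): five reference years and one much later, narrow ring
year <- c(1000, 1001, 1002, 1003, 1004, 1500)
ring_width <- c(1.0, 1.2, 0.8, 1.1, 0.9, 0.5)

z <- z_score_vs_reference(ring_width, year >= 1000 & year <= 1099)
```

Within the reference years the scores average to zero, and the narrow ring of the last year gets a clearly negative score, which is the pattern the report looks for after a cold year.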
# Create scatterplot
tree_rings %>%
ggplot(mapping = aes(x = n_tree, y = europe_temp_index)) +
geom_point(color = "darkred") +
ggthemes::theme_wsj() +
labs(title = "Scatterplot of tree ring z-scores vs temperature for Europe",
x = "Tree ring z-scores relative to year = 1000-1099",
y = "Pages 2K Temperature relative to 1961-1990",
caption = "Source: Tidytuesday") +
theme(plot.title = element_text(size = 13),
axis.title = element_text(size = 8),
plot.caption = element_text(size = 8))
Amazing! The correlation is 0.79. It seems that when temperatures are colder, the tree ring z-score is negative, which means that the rings are narrower than average. Please note that correlation does not imply that the drop in temperature causes the tree rings to shrink. There are probably other variables influencing the result that we have not taken into account. Having said that, I still find this discovery interesting.
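For readers who want to compute such a coefficient themselves, a minimal sketch of the Pearson correlation in base R follows. The vectors are made-up toy values, not the tree_rings data. With the real data, one would presumably call `cor()` on the `n_tree` and `europe_temp_index` columns; that is an assumption, since the report does not show this step. `pearson` is a hypothetical helper that writes the formula out.

```r
# Toy values (made up for illustration; not taken from tree_rings)
ring_z <- c(-1.2, -0.4, 0.1, 0.6, 1.3)
temp_index <- c(-0.9, -0.5, 0.2, 0.3, 1.0)

# Pearson correlation written out from its definition:
# covariance of x and y divided by the product of their spreads.
pearson <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)          # keep complete pairs only
  x <- x[ok]
  y <- y[ok]
  sum((x - mean(x)) * (y - mean(y))) /
    sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))
}

r_manual <- pearson(ring_z, temp_index)
# Built-in equivalent; use = "complete.obs" drops pairs with missing values
r_builtin <- cor(ring_z, temp_index, use = "complete.obs")
```

Both values agree, and a value close to 1 means that low z-scores tend to go with low temperatures, without saying anything about which one causes the other.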
Well, that’s it. I hope you enjoyed it.
A work by Valentin Monney